Biostatistics For Dummies, 2nd Edition (Monika Wahi, John Pezzullo)

CHAPTER 23 Survival Regression

The baseline survival function’s table may have hundreds of rows for large data sets, so instead of printing it, you should save the table as a data file. Then, you can use it to generate a customized prognosis curve (described in the next section) for any specific set of values for the predictor variables.

The software may also offer a graph of the baseline survival function. If your software is using an average-participant baseline (see the earlier section, “The steps to perform a PH regression”), this graph is useful as an indicator of the entire group’s overall survival. But if your software uses a zero-participant baseline, the curve is not helpful.
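
The difference between the two baseline conventions can be made concrete. Under the PH model, a baseline curve defined at one set of reference predictor values can be re-expressed at another set by raising each survival probability to a power. The following is a minimal Python sketch of that idea, assuming your software gave you a zero-participant baseline and you want the average-participant version. The function name, the coefficient, the mean age, and the survival values are all made up for illustration and don't come from any real output.

```python
import math

def rebase_baseline(surv_zero, coefs, means):
    """Convert a zero-participant baseline to an average-participant baseline.

    Under the PH model, S_avg(t) = S_zero(t) ** exp(sum(b_i * mean_i)),
    where b_i are the regression coefficients and mean_i the predictor means.
    """
    exponent = math.exp(sum(b * m for b, m in zip(coefs, means)))
    return [s ** exponent for s in surv_zero]

# Hypothetical values, for illustration only
surv_zero = [0.999, 0.997, 0.994]  # baseline survival when every predictor is 0
coefs = [0.04]                     # one predictor (age), made-up coefficient
means = [60.0]                     # made-up mean age of the group

surv_avg = rebase_baseline(surv_zero, coefs, means)
```

With a positive age coefficient like this one, a hypothetical participant of age 0 has a much lower hazard than anyone actually in the group, so the zero-based curve hugs 1.0 and says little about the group's real survival. The rebased curve drops faster and is the one worth graphing.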

How Long Have I Got, Doc? Constructing Prognosis Curves

A primary reason to use regression analysis is to predict outcomes from any particular set of predictor values. For survival analysis, you can use the regression coefficients from a PH regression along with the baseline survival curve to construct an expected survival (prognosis) curve for any set of predictor values.
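
As a rough illustration of how the coefficients and the baseline table combine, here's a minimal Python sketch. It relies on the standard PH relationship: the expected survival at each time is the baseline survival raised to the power exp(sum of coefficient × (predictor value − reference value)), where the reference values are the predictor means for an average-participant baseline or zeros for a zero-participant baseline. The function name, the baseline rows, the coefficients, and the patient values are all hypothetical, and treating stage and grade as plain numbers is a simplifying assumption.

```python
import math

def prognosis_curve(baseline, coefs, values, reference):
    """Return a list of (time, expected survival) for one set of predictor values.

    baseline  : list of (time, baseline_survival) rows
    coefs     : regression coefficients, one per predictor
    values    : the predictor values to get a prognosis for
    reference : predictor values at which the baseline is defined
                (means for an average-participant baseline,
                zeros for a zero-participant baseline)
    """
    if not (len(coefs) == len(values) == len(reference)):
        raise ValueError("coefs, values, and reference must have equal length")
    # Linear predictor measured relative to the baseline's reference point
    lp = sum(b * (x - r) for b, x, r in zip(coefs, values, reference))
    hazard_ratio = math.exp(lp)
    # Raising baseline survival to the hazard ratio gives the tailored curve
    return [(t, s0 ** hazard_ratio) for t, s0 in baseline]

# Hypothetical numbers, for illustration only
baseline = [(1.0, 0.95), (2.0, 0.88), (5.0, 0.70)]  # (years, survival)
coefs = [0.03, 0.50, 0.25]       # age, stage, grade (made up)
reference = [60.0, 2.0, 2.0]     # made-up means (average-participant baseline)
patient = [70.0, 3.0, 2.0]       # made-up predictor values

curve = prognosis_curve(baseline, coefs, patient, reference)
```

Passing the reference point explicitly means the same function works with either baseline convention. Feeding it the reference values themselves as the patient returns the baseline table unchanged, which makes a handy sanity check.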

Suppose that you’re studying survival time (from diagnosis to death) for a group of cancer patients in which the predictors are age, tumor stage, and tumor grade at the time of diagnosis. You’d run a PH regression on your data and have the program generate the baseline survival curve as a table of times and survival probabilities.

After that, whenever a patient is newly diagnosed with cancer, you can take that person’s age, stage, and grade, and generate an expected survival curve tailored for that particular patient. (The patient may not want to see it, but at least it could be done.)

You’ll probably have to do these calculations outside of the software that you use for the survival regression, but the calculations aren’t difficult and can be done in a Microsoft Excel spreadsheet. The example in the following sections uses the small set of sample data that’s preloaded into the online calculator for PH regression at https://statpages.info/prophaz.html. This particular example has only one predictor, but the basic idea extends to multiple predictors.

Obtaining the necessary output

Figure 23-6 shows the output from the built-in example (omitting the Iteration History and Overall Model Fit sections). Pretend that this model represents survival, in years, as a function of age for patients just diagnosed with some particular disease. In the output, the age variable is called Variable 1.